feat(huggingFace): add HuggingFaceModelResource for model browsing and media proxy by PG1204 · Pull Request #5124 · apache/texera

PG1204 · 2026-05-17T21:33:40Z

What changes were proposed in this PR?

Introduces HuggingFaceModelResource - a Jersey REST resource at /api/huggingface/* that backs the upcoming HuggingFace operator's model picker, audio upload, and media preview UI. Five endpoints:

Endpoint	Purpose
`GET /api/huggingface/models?task=…[&search=…]`	Browse or search HF models
`GET /api/huggingface/tasks`	List HF pipeline tags with hosted inference
`POST /api/huggingface/upload-audio?filename=…`	Stream-upload audio files
`GET /api/huggingface/audio-preview?path=…`	Stream uploaded audio back
`GET /api/huggingface/media-proxy?url=…`	Proxy allowlisted remote media URLs (CORS bypass)

Plus a single-line registration of the resource in TexeraWebApplication.

Architectural notes:

Token sourcing: the user's HF token arrives via the X-HF-Token request header (forwarded by the frontend from the operator's property panel in a follow-up PR). When absent, requests go to HF Hub anonymously. There is no server-side env-var token.
Caching: bounded Guava Cache (size + TTL) for /models and /tasks results. User-token requests bypass the cache to avoid serving one user's token-scoped list to another.
Streaming upload: /upload-audio reads InputStream straight to disk in 8 KB chunks with a 25 MiB cap (returns 413 on exceedance) - the request body is never buffered in memory. Extension allowlist rejects non-audio types up front.
SSRF protection: /media-proxy requires the URL's host to be in an allowlist (HF, fal.media, replicate.delivery/com) with a leading-dot suffix guard against lookalike domains.
Bounded fan-out: /tasks uses a dedicated ForkJoinPool(4) for its per-task probe instead of the JVM's global common pool, with explicit 429/503 detection that logs at WARN.
Truncation visibility: browse and search responses carry an X-Texera-Truncated: true header when results were capped, so the frontend can show "list incomplete" hints.
Error responses: generic Jackson-built JSON bodies (no exception internals leak to clients); details are logged server-side.

Any related issues, documentation, or discussions?

Tracked in #5134 & #5041(umbrella issue for the HuggingFace operator end-to-end implementation). This PR is the backend foundation; subsequent PRs will add the operator class, frontend property panel, result-panel media rendering, and developer documentation.

Closes #5134

How was this PR tested?

Unit tests: amber/src/test/scala/.../HuggingFaceModelResourceSpec.scala - 86 ScalaTest cases covering token sanitization, SSRF allowlist (including lookalike-domain rejection), JSON error escaping, MIME type inference, the audio-upload validation/size-cap/extension paths, audio-preview path validation and traversal rejection, media-proxy rejection paths, cache hit/bypass semantics, and the temp-dir sweep. Run with sbt 'WorkflowExecutionService/testOnly org.apache.texera.web.resource.HuggingFaceModelResourceSpec' - all 86 pass in ~6 seconds, no external network required.
Manual smoke tests against a local backend:
- GET /api/huggingface/tasks returns the expected JSON task list.
- GET /api/huggingface/models?task=text-generation returns the paginated model list; text-generation shows the X-Texera-Truncated: true header when MAX_PAGES=50 is hit.
- POST /upload-audio?filename=evil.sh → 400 (extension allowlist).
- POST /upload-audio with a 30 MiB body → 413 (size cap).
- GET /media-proxy?url=http://localhost:8080/ → 403 (SSRF allowlist).

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.7 in compliance with ASF

…d media proxy Introduces a new Jersey REST resource exposing endpoints used by the upcoming HuggingFace operator UI: - GET /api/huggingface/models — browse / search models per task - GET /api/huggingface/tasks — list HF pipeline tags with hosted inference - POST /api/huggingface/upload-audio — upload audio for HF audio tasks - GET /api/huggingface/audio-preview — stream uploaded audio (path-validated) - GET /api/huggingface/media-proxy — proxy remote media URLs to bypass CORS This is the first PR in a stacked series landing the HF operator end-to-end. No operator code yet; this resource is independently useful and lets the frontend integrate with HF before the operator class lands.

PG1204 · 2026-05-17T21:39:22Z

/request-review @Ma77Ball

codecov-commenter · 2026-05-17T21:44:29Z

Codecov Report

❌ Patch coverage is 66.85393% with 118 lines in your changes missing coverage. Please review.
✅ Project coverage is 47.85%. Comparing base (953e2c4) to head (2b852ae).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
...texera/web/resource/HuggingFaceModelResource.scala	67.04%	90 Missing and 27 partials ⚠️
...a/org/apache/texera/web/TexeraWebApplication.scala	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #5124      +/-   ##
============================================
- Coverage     49.16%   47.85%   -1.31%     
- Complexity     2384     2401      +17     
============================================
  Files          1051     1043       -8     
  Lines         40350    40261      -89     
  Branches       4279     4302      +23     
============================================
- Hits          19837    19268     -569     
- Misses        19353    19834     +481     
+ Partials       1160     1159       -1

Flag	Coverage Δ		*Carryforward flag
access-control-service	`39.41% <ø> (-2.49%)`	⬇️	Carriedforward from 5e95bcd
agent-service	`33.76% <ø> (ø)`		Carriedforward from 5e95bcd
amber	`51.96% <66.85%> (+0.30%)`	⬆️
computing-unit-managing-service	`0.00% <ø> (ø)`		Carriedforward from 5e95bcd
config-service	`0.00% <ø> (ø)`		Carriedforward from 5e95bcd
file-service	`32.09% <ø> (-6.33%)`	⬇️	Carriedforward from 5e95bcd
frontend	`37.93% <ø> (-3.15%)`	⬇️	Carriedforward from 5e95bcd
python	`90.50% <ø> (-0.30%)`	⬇️	Carriedforward from 5e95bcd
workflow-compiling-service	`56.81% <ø> (ø)`		Carriedforward from 5e95bcd

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Yicong-Huang · 2026-05-18T04:33:58Z

@PG1204 Thanks for opening this PR! Please do the following:

please follow our PR template and make the description concise.
please make sure your code meets the test coverage.
please use issues to describe future plans such as stacked PRs. This is because each PR after merge will become immutable. Issues can hold information that is longer than a PR's life cycle, and can subject to updates. If you are planning for opening multiple PRs, I suggest you use an umbrella issue to contain multiple sub issues, each for one PR.
you can use /request-review @xxx to request reviewer for review.

PG1204 · 2026-05-18T04:39:04Z

@Yicong-Huang

Thank you for the suggestions. Will update the PR accordingly.

Ma77Ball · 2026-05-18T19:30:44Z

Hi @PG1204, while I begin my review, please address @Yicong-Huang's feedback. Specifically:

Update the PR description to follow this template exactly:

   ### What changes were proposed in this PR?
   ...
   ### Any related issues, documentation, or discussions?
   ...
   ### How was this PR tested?
   ...
   ### Was this PR authored or co-authored using generative AI tooling?
   ...

Add test coverage for as much of the new code as possible. At a minimum, please cover the main features and call paths introduced here.
Relocate the overall PR plan to the parent issue, and keep this PR's description scoped to the code changes it actually contains.
Document any architectural changes. If this PR modifies the architecture, please describe what changed and where, so reviewers can follow the design intent.

Thanks, and looking forward to the updates!

Ma77Ball

Please review and resolve the comments and ask any questions as needed.

PG1204 · 2026-05-20T03:11:07Z

/request-review @Ma77Ball requesting re-review for the changes.

Ma77Ball

LGTM!

Note

Suggestions above that were not resolved should be resolved in the upcoming PRs. Also, test cases should be added in future PRs to address the missing lines reported by codecov.

Ma77Ball · 2026-05-27T11:15:36Z

/request-review @xuang7

xuang7

The PR looks good overall. I left two comments. Please also resolve any existing comments if they can be addressed in this PR, and mark them as resolved.

Addresses xuang7's review on PR apache#5124 — both endpoints previously buffered the full payload into a heap-resident byte[] with no upper bound, leaving the JVM open to OOM on a hostile or buggy upstream response (/media-proxy) or out-of-band write into the audio temp dir (/audio-preview). - /media-proxy: switch from Unirest.asBytes() to asObject(Function<RawResponse, T>), streaming the upstream body in 8 KiB chunks with a running byte counter. Aborts with 413 if the declared Content-Length exceeds the cap (pre-check) or if the body crosses the cap mid-read (defends against missing/lying Content-Length). New MAX_MEDIA_PROXY_BYTES = 50 MiB, sized for HF inference media (text-to-image ~5 MiB, text-to-video ~30 MiB) with headroom. - /audio-preview: add Files.size() defense-in-depth check before readAllBytes. /upload-audio already enforces MAX_AUDIO_BYTES on ingest; this catches the case where a bug or out-of-band write puts an oversized file in the temp dir. Adds a spec covering the audio-preview cap using a sparse-file fixture so the test stays fast (87/87 spec passes). The media-proxy cap path is exercised via the existing input-validation suite plus the new streamMediaWithCap helper - a follow-up can add a fake-RawResponse unit test if reviewers want explicit coverage of the chunked-read cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@JsonProperty

…eration Splits the monolithic 1,278-line HuggingFaceInferenceOpDesc from the team's feature branch into a dispatcher + per-task codegen architecture and ships the first task family (text-generation) end-to-end. - TaskCodegen trait + CodegenContext model the per-task variation - PythonCodegenBase emits the shared provider-fallback / process_table / _parse_response infrastructure with two holes for the per-task payload and parse snippets - TextGenCodegen supplies text-generation's chat-completions payload and the body["choices"][0]["message"]["content"] parse branch - HuggingFaceInferenceOpDesc becomes a thin dispatcher (~180 lines) holding @JsonProperty fields and the registeredCodegens map User-input string fields are typed as EncodableString and emitted via the pyb"..." macro so values reach Python as self.decode_python_template('<base64>') rather than raw literals; class constants are assigned in open(self) so self is in scope for the decode call. Generated process_table runs a defensive _HF_MODEL_ID_PATTERN check at runtime before any HF URL is composed. PR 2 of a stacked 9-PR series. PR 1 (apache#5124) ships the supporting REST resource; PRs 3-5 will add image, audio + media-gen, and QA/ranking task families by registering new *Codegen objects in the dispatcher. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PG1204 · 2026-05-28T18:27:50Z

@Ma77Ball Would you prefer that I resolve the conversations or you'd rather resolve them. If any of the comments still require work, I shall work on them and update the PR.

@JsonProperty

…eration Splits the monolithic 1,278-line HuggingFaceInferenceOpDesc from the team's feature branch into a dispatcher + per-task codegen architecture and ships the first task family (text-generation) end-to-end. - TaskCodegen trait + CodegenContext model the per-task variation - PythonCodegenBase emits the shared provider-fallback / process_table / _parse_response infrastructure with two holes for the per-task payload and parse snippets - TextGenCodegen supplies text-generation's chat-completions payload and the body["choices"][0]["message"]["content"] parse branch - HuggingFaceInferenceOpDesc becomes a thin dispatcher (~180 lines) holding @JsonProperty fields and the registeredCodegens map User-input string fields are typed as EncodableString and emitted via the pyb"..." macro so values reach Python as self.decode_python_template('<base64>') rather than raw literals; class constants are assigned in open(self) so self is in scope for the decode call. Generated process_table runs a defensive _HF_MODEL_ID_PATTERN check at runtime before any HF URL is composed. PR 2 of a stacked 9-PR series. PR 1 (apache#5124) ships the supporting REST resource; PRs 3-5 will add image, audio + media-gen, and QA/ranking task families by registering new *Codegen objects in the dispatcher. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

xuang7

LGTM!

@RolesAllowed

Per review on apache#5124 (xuang7, Ma77Ball): mark the resource with @RolesAllowed(Array("REGULAR", "ADMIN")) to document that all five endpoints require an authenticated user. The annotation isn't enforced yet — that's coming with the auth-enforcement PR @Yicong-Huang and @Ma77Ball are working on — but adding it now means no follow-up change is needed when enforcement lands, and it matches the convention used by UserConfigResource / AdminSettingsResource. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@JsonProperty

…eration Splits the monolithic 1,278-line HuggingFaceInferenceOpDesc from the team's feature branch into a dispatcher + per-task codegen architecture and ships the first task family (text-generation) end-to-end. - TaskCodegen trait + CodegenContext model the per-task variation - PythonCodegenBase emits the shared provider-fallback / process_table / _parse_response infrastructure with two holes for the per-task payload and parse snippets - TextGenCodegen supplies text-generation's chat-completions payload and the body["choices"][0]["message"]["content"] parse branch - HuggingFaceInferenceOpDesc becomes a thin dispatcher (~180 lines) holding @JsonProperty fields and the registeredCodegens map User-input string fields are typed as EncodableString and emitted via the pyb"..." macro so values reach Python as self.decode_python_template('<base64>') rather than raw literals; class constants are assigned in open(self) so self is in scope for the decode call. Generated process_table runs a defensive _HF_MODEL_ID_PATTERN check at runtime before any HF URL is composed. PR 2 of a stacked 9-PR series. PR 1 (apache#5124) ships the supporting REST resource; PRs 3-5 will add image, audio + media-gen, and QA/ranking task families by registering new *Codegen objects in the dispatcher. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot added the engine label May 17, 2026

github-actions Bot assigned PG1204 May 17, 2026

Ma77Ball suggested changes May 18, 2026

View reviewed changes

fix: address review feedback on HuggingFaceModelResource

935ccc1

github-actions Bot requested a review from Ma77Ball May 19, 2026 23:46

PG1204 mentioned this pull request May 20, 2026

Add HuggingFaceModelResource REST endpoints for HF operator UI #5134

Open

6 tasks

Merge branch 'apache:main' into hf/01-backend-skeleton

089c3c4

Ma77Ball mentioned this pull request May 26, 2026

Add Hugging Face inference operator #5041

Open

Merge branch 'apache:main' into hf/01-backend-skeleton

2aa865c

Ma77Ball approved these changes May 27, 2026

View reviewed changes

xuang7 requested changes May 28, 2026

View reviewed changes

Comment thread amber/src/main/scala/org/apache/texera/web/resource/HuggingFaceModelResource.scala

Comment thread amber/src/main/scala/org/apache/texera/web/resource/HuggingFaceModelResource.scala

PG1204 and others added 2 commits May 28, 2026 07:01

Merge branch 'apache:main' into hf/01-backend-skeleton

0c30beb

chore: retrigger CI

6857e34

This was referenced May 28, 2026

Add HuggingFaceInferenceOpDesc with dispatcher + per-task codegen architecture (text-generation) #5277

Open

feat(huggingFace): refactor operator into per-task codegen + text-generation #5278

Open

PG1204 and others added 3 commits May 28, 2026 13:06

Merge branch 'apache:main' into hf/01-backend-skeleton

6f0f5fb

Merge branch 'main' into hf/01-backend-skeleton

fec6dfb

Merge branch 'apache:main' into hf/01-backend-skeleton

5e95bcd

xuang7 approved these changes May 29, 2026

View reviewed changes

Conversation

PG1204 commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this PR?

Any related issues, documentation, or discussions?

How was this PR tested?

Was this PR authored or co-authored using generative AI tooling?

Uh oh!

PG1204 commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Yicong-Huang commented May 18, 2026

Uh oh!

PG1204 commented May 18, 2026

Uh oh!

Ma77Ball commented May 18, 2026

Uh oh!

Ma77Ball left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

PG1204 commented May 20, 2026

Uh oh!

Ma77Ball left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ma77Ball commented May 27, 2026

Uh oh!

xuang7 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

PG1204 commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xuang7 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

PG1204 commented May 17, 2026 •

edited

Loading

PG1204 commented May 17, 2026 •

edited

Loading

codecov-commenter commented May 17, 2026 •

edited

Loading

Ma77Ball left a comment •

edited

Loading

xuang7 left a comment •

edited

Loading

PG1204 commented May 28, 2026 •

edited

Loading